ABSTRACT

An algorithm for generating deep-layer mean temperatures from satellite-observed microwave observations 
is presented. Unlike traditional temperature retrieval methods, this algorithm does not require a first guess 
temperature of the ambient atmosphere. By eliminating the first guess a potentially systematic source of error 
has been removed. The algorithm is expected to yield long-term records that are suitable for detecting small 
changes in climate. 

The atmospheric contribution to the deep-layer mean temperature is given by the averaging kernel. The 
algorithm computes the coefficients that will best approximate a desired averaging kernel from a linear combination
of the satellite radiometer's weighting functions. The coefficients are then applied to the measurements to yield 
the deep-layer mean temperature. Three constraints were used in deriving the algorithm: 1) the sum of the
coefficients must be one, 2) the noise of the product is minimized, and 3) the shape of the approximated 
averaging kernel is well behaved. Note that a trade-off between constraints 2 and 3 is unavoidable. 

The algorithm can also be used to combine measurements from a future sensor [i.e., the 20-channel Advanced
Microwave Sounding Unit (AMSU)] to yield the same averaging kernel as that based on an earlier sensor [i.e.,
the 4-channel Microwave Sounding Unit (MSU)]. This will allow a time series of deep-layer mean temperatures
based on MSU measurements to be continued with AMSU measurements. The AMSU is expected to replace 
the MSU in 1996. 


1. Introduction 

For long-term monitoring of temperature change, 
deep-layer mean temperatures derived directly from 
satellite observations of upwelling radiance have an 
advantage over traditional operational temperature re- 
trievals. The advantage is that unlike operational re- 
trieval algorithms (Eyre 1989; Fleming et al. 1988; 
Goldberg et al. 1988; Hayden 1988) an algorithm for 
deriving deep-layer temperature directly can be made 
independent of a first guess of the ambient temperature 
profile. Operational retrievals are dependent on a first 
guess because the satellite observations alone do not 
have the vertical resolution to yield pointwise temper- 
atures, which are needed for forecast models. Unfor- 
tunately, the error between the first guess and the true 
ambient condition is systematic and, furthermore, the 
error cannot be entirely removed by the retrieval pro- 
cess (Thompson and Tripputi 1994). Since significant 
climate change on a global scale can be on the order 
of only tenths of a degree, temperature products independent
of a first guess are a step in the right direction.
First-guess independence provides certainty that
any observed trends in the data are not due to errors 
in the first guess, which could very well have its own 
interannual variation. Deep-layer mean temperatures 
are appropriate for long-term monitoring of temper- 
ature trends because nearly all climate models have 
indicated that climate changes will occur over deep 
layers and not at isolated levels (Mitchell et al. 1990). 

The utilization of measurements from the Micro- 
wave Sounding Unit (MSU), on board NOAA's op- 
erational polar orbiting satellites, has gained much rec- 
ognition during the past few years as a measure of deep- 
layer mean temperature for long-term monitoring of 
climate change (Spencer and Christy 1992a,b, 1993; 
Spencer et al. 1990). Because radiance in this spectral 
region is extremely linear with respect to temperature, 
the observations can be interpreted as deep-layer mean 
temperatures for the layer defined by the weighting 
function. This is not true for the infrared spectral re- 
gion, where temperature and radiance can be very 
nonlinear. Microwave observations are usually ex- 
pressed in brightness temperature, which can be ob- 
tained from radiance using the inverse form of the 
Planck function. 
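
The conversion from radiance to brightness temperature mentioned above can be sketched as follows. This is only an illustration: the function names and the 250 K scene temperature are assumptions chosen for the example, and the 53.73 GHz frequency is MSU channel 2 as listed in this paper. The Rayleigh-Jeans value is included to show how close to linear the radiance-temperature relationship is at these frequencies.

```python
import math

H = 6.62607015e-34   # Planck constant (J s)
K_B = 1.380649e-23   # Boltzmann constant (J / K)
C = 2.99792458e8     # speed of light (m / s)


def planck_radiance(freq_hz, temp_k):
    """Spectral radiance B_nu(T) in W m^-2 sr^-1 Hz^-1."""
    x = H * freq_hz / (K_B * temp_k)
    return 2.0 * H * freq_hz**3 / C**2 / math.expm1(x)


def brightness_temperature(freq_hz, radiance):
    """Inverse Planck function: the temperature that reproduces `radiance`."""
    ratio = 2.0 * H * freq_hz**3 / (C**2 * radiance)
    return H * freq_hz / (K_B * math.log1p(ratio))


freq = 53.73e9                      # MSU channel 2
radiance = planck_radiance(freq, 250.0)
t_b = brightness_temperature(freq, radiance)

# Rayleigh-Jeans (strictly linear) approximation, for comparison only.
t_rj = radiance * C**2 / (2.0 * K_B * freq**2)
print(t_b, t_rj)
```

The exact inverse recovers the input temperature, while the linear approximation differs from it by roughly a kelvin, an offset that is nearly constant over atmospheric temperatures.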

The MSU has four channels measuring outgoing ra- 
diation at 50.31, 53.73, 54.96, and 57.95 GHz. Channel 
1 (50.31 GHz) has a large surface component and is 
generally not used for deriving temperature because of
uncertainty in the surface emissivity. The first MSU was
launched in 1979, and to date, its replacements have
provided nearly complete daily coverage of the earth
by scanning across the orbital track to ±47.35 degrees
about nadir in increments of approximately 9.47 degrees.
The MSU’s six view angles result in the projection on
the earth of 11 fields of view (FOVs) for each scan line.
The weighting functions for channels 2 through 4 at 
each of the six view angles are given in Fig. 1. The 
highest peaking group of weighting functions is for 
MSU channel 4, followed by MSU channels 3 and 2. 
The higher peaking weighting functions in each channel 
grouping are associated with larger off-nadir angles. 
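
The scan geometry described above can be written out in a few lines. The sketch assumes that the 11 FOVs are numbered 1 to 11 across the scan with FOV 6 at nadir, and that view angle 1 denotes nadir, which is consistent with the way view angles and fields of view are paired later in this paper; the function names are hypothetical.

```python
N_FOV = 11        # fields of view per scan line (from the text)
STEP_DEG = 9.47   # approximate scan step in degrees (from the text)
NADIR_FOV = (N_FOV + 1) // 2   # FOV 6, assuming 1-based numbering


def scan_angle_deg(fov):
    """Signed scan angle of a 1-based FOV index (negative on one side of nadir)."""
    return (fov - NADIR_FOV) * STEP_DEG


def view_angle_index(fov):
    """View-angle index: 1 at nadir, increasing to 6 at either scan edge."""
    return abs(fov - NADIR_FOV) + 1


angles = [round(scan_angle_deg(f), 2) for f in range(1, N_FOV + 1)]
indices = [view_angle_index(f) for f in range(1, N_FOV + 1)]
print(angles)    # runs from -47.35 to 47.35
print(indices)   # [6, 5, 4, 3, 2, 1, 2, 3, 4, 5, 6]
```

Five steps of 9.47 degrees give the 47.35-degree scan limit, and the symmetry about nadir is why 11 FOVs yield only six distinct weighting functions per channel.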

Spencer and Christy (1992a) used MSU channel 2
(53.73 GHz) brightness temperatures, adjusted to nadir,
to monitor temperature for the layer defined by
the channel 2 weighting function on a 2.5° gridpoint
scale with a monthly precision of better than 0.1°C in
the Tropics and better than 0.2°C at high latitudes.
These estimates of precision were arrived at through
intersatellite comparisons and comparisons with radiosondes.
They conclude that “the satellite precision
approaches that of individual radiosonde stations in
their ability to measure monthly temperature anomalies ...”
In terms of monthly, zonally averaged temperatures,
they estimate their precision is of the order
of 0.01°C over a 10-year period.

A deep-layer mean temperature from a single mi- 
crowave observation has the equivalent vertical reso- 
lution of the channel. Improved vertical resolution can 
be obtained by combining different channels. The layer 



Fig. 1. MSU weighting functions for channels 2, 3, and 4 at all
view angles and Spencer's derived averaging kernel (dotted curve).



Fig. 2. The influence of the gamma parameter 
on the shape of the averaging kernel. 


is now defined by the averaging kernel, which is simply 
derived from a linear combination of the weighting 
functions (using the same coefficients used to combine
the measurements). To remove the stratospheric component
from MSU channel 2, Spencer and Christy
(1992b) combined channel 2 measurements at different
viewing angles to create an averaging kernel, shown as
the dotted curve in Fig. 1, that is narrower than the raw
nadir-viewing weighting function. It is interesting to
note that the raw channel 2 time series for the period
1979-90 showed a global warming trend of only
0.015°C per decade, while the combined-angle approach
yielded an increased global warming trend of
0.032°C per decade. By combining different viewing
angles, Spencer was retrieving additional information
that a single channel at a common view angle was unable
to provide. The only a priori information required
was knowledge of the weighting functions, which for
the MSU are well known and can be derived from a
standard atmosphere. Because the MSU weighting
functions are very weakly dependent on temperature 
and moisture, a fixed set of coefficients can be used 
globally to derive the deep-layer mean. This is not true 
for infrared measurements; their weighting functions 
generally have a much greater dependency on the am- 
bient atmosphere. 

Spencer did not use an algorithm to determine the 
coefficients for his lower-troposphere deep-layer mean 
temperature. He used trial and error by visual inspec- 
tion of the averaging kernel to determine the appro- 
priate coefficients. This technique is acceptable when 
considering only a small number of channels or angles.
However, as the number of different channels and view 
angles increases, the determination of the coefficients 
to yield a desired averaging kernel becomes a formi- 
dable task. A quantitative retrieval algorithm is re- 
quired to optimally solve for the coefficients. The coef- 
ficients need to be optimal in the sense that the derived 
averaging kernel is well behaved and that the size of the
coefficients is constrained so that the noise of the
product does not become large. 

The emphasis of this paper is to present an algorithm 
to derive deep-layer mean temperatures from micro- 
wave observations within the band 50-60 GHz. The 
algorithm, derived in section 2, computes the coeffi- 
cients needed to combine a set of channel weighting 
functions into a desired deep-layer mean averaging 
kernel. The deep-layer mean temperature is obtained 
by simply applying the coefficients directly to the ob- 
served brightness temperatures. Examples of averaging 
kernels from the MSU are given in section 3. We will 
also demonstrate that the MSU temperature time series 
that Spencer pioneered can be continued with the next 
generation of microwave sounders — the 20-channel 
Advanced Microwave Sounding Unit (AMSU) 
(Fischer 1987). The first AMSU is expected to be 
launched in 1996. This will be accomplished by con- 
straining the averaging kernel associated with the set 
of measurements from the AMSU instrument to be 
approximately equal to the averaging kernel associated 
with the set of measurements from the MSU instru- 
ment. 
Fig. 4. Comparison of the boxcar-derived averaging kernel, based
on MSU channel 2 at view angles 3 through 6, and Spencer’s derived
averaging kernel (dotted curve). Also shown are the coefficients, the
starting and ending levels and pressures of the boxcar function, the
sum of the square of the coefficients, the noise of the product, the
value of the gamma parameter, the sum of the coefficients, and the
integrated difference between the shape constraint and the derived
averaging kernel.


Fig. 3. The relationship between $\gamma$ (gamma) and product noise
and the required sample to reduce the product noise to 0.1 K.


2. Algorithm

Our algorithm for computing deep-layer mean temperatures
and their corresponding averaging kernels is a
specialized adaptation of the Backus-Gilbert theory
discussed in Conrath (1972). The Conrath paper discusses
the trade-off between instrumental noise and
the vertical resolution of the averaging kernel for a given 
atmospheric level and set of measurements. The der- 
ivation of our algorithm begins with the same basic 
definition of the averaging kernel used by Conrath. 
However, our approach differs from Conrath's with respect
to application and constraint. Conrath’s constraint
is to derive coefficients that, when applied to
the weighting functions, attempt to reproduce the ideal
Dirac delta function. In other words, he is trying to
obtain the highest-resolution averaging kernel possible,
cognizant of the effects of instrumental noise, for a
particular level in the atmosphere. This approach is
very useful for comparing the resolving power of current
and future sounders. On the other hand, our constraint
is to yield coefficients that will reproduce a prespecified
averaging kernel. Our averaging kernel, unlike
Conrath’s, is not associated with a given level. Instead
it is “predesigned” to correspond to a desired deep-layer
mean temperature $t_L$ derived from a linear combination
of $n$ measured brightness temperatures $T_i$.
That is,

$$t_L = c_1 T_1 + \cdots + c_n T_n, \tag{1}$$

where the $c_i$ are the coefficients of the linear combination.

a. Algorithm constraints

To optimize the coefficients in (1) for a given atmospheric
layer and a given set of channels and viewing
angles, three constraints have been imposed, which are now
explained in detail. The first constraint requires
that the sum of the coefficients be unity. Since $t_L$ of (1)
can be interpreted as a weighted average of brightness
temperatures, the weights must be normalized by constraining
the coefficients to sum to one; that is,

$$c_1 + \cdots + c_n = 1. \tag{2}$$

Thus, if all $n$ of the $T_i$ in (1) are identical, then (2)
guarantees that $t_L$ will have that same value. Since the
$T_i$ are normalized so that a constant shift of one degree
in the temperature profile will result in a shift of one
degree in the $T_i$, this constraint will ensure that $t_L$ has
the same property.

Fig. 6. Gaussian-derived MSU averaging kernels using
MSU 2, 3, and 4 at all view angles.


The second constraint addresses the problem that
each of the brightness temperatures used in (1) carries
with it a measurement error. Let $\sigma_i^2$ be the variance of
the error associated with $T_i$ and let $\sigma^2$ be the variance
of the total error associated with $t_L$. It is well known
that with independence of the individual errors the relationship
between the total error variance and the individual
error variances is given by

$$\sigma^2 = c_1^2 \sigma_1^2 + \cdots + c_n^2 \sigma_n^2. \tag{3}$$

Consequently, to minimize the magnitude of $\sigma^2$ we
require as a second constraint that the sum in (3) be
a minimum.
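
To make the role of the second constraint concrete, the short sketch below evaluates Eq. (3) for two coefficient sets that both satisfy Eq. (2). The function name is hypothetical. The 0.33 K channel noise is the value assumed in this study, and the second coefficient set (2.0, 2.0, -1.5, -1.5) is the one quoted in section 3a for MSU-2R; the equal-weight set is added purely for contrast.

```python
import math


def product_noise(coeffs, channel_sigmas):
    """Standard deviation of t_L from Eq. (3), assuming independent errors."""
    return math.sqrt(sum((c * s) ** 2 for c, s in zip(coeffs, channel_sigmas)))


sigmas = [0.33] * 4                     # per-measurement noise (K)
equal = [0.25, 0.25, 0.25, 0.25]        # plain average, sums to one
differenced = [2.0, 2.0, -1.5, -1.5]    # also sums to one

noise_equal = product_noise(equal, sigmas)        # 0.33 * sqrt(0.25)
noise_diff = product_noise(differenced, sigmas)   # 0.33 * sqrt(12.5)
print(noise_equal, noise_diff)
```

Both sets are properly normalized, yet the differenced set amplifies the measurement noise by about a factor of seven relative to the plain average, which is why the size of the coefficients has to be controlled.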

Fig. 5. Boxcar-derived averaging kernels using
MSU 2, 3, and 4 at all view angles.


For the third constraint one must determine the
coefficients $c_i$ of (1) in such a way that the deep-layer
mean averaging kernel agrees with the desired averaging
kernel as closely as possible. The averaging kernel is
defined through the weighting
functions $w_i(x)$ associated with the $i$th channel,
which are the components of the kernel function in
the radiative transfer equation. Thus, the layer over
which $t_L$ of (1) is defined is given by the so-called
“averaging kernel,” given by the linear combination

$$a(x) = c_1 w_1(x) + \cdots + c_n w_n(x). \tag{4}$$

Equation (4) follows directly from (1). Note that $x$
can be any monotonic function of the atmospheric
pressure $p$. The purpose in making $w_i$ a function of $x$,
instead of $p$ directly, is that by judiciously choosing
the transformation from $p$ to $x$, one can shape the
weighting function to suit specific needs. Each weighting
function also has
the property that the sum of $w_i(x)$ over the range of $x$
is unity. Because of the first constraint, the sum of $a(x)$
over the range of $x$ is also unity. The values of the
algorithm-derived averaging kernel represent the true
weights of the contribution of the unknown temperature
profile to $t_L$.

Note that the first two constraints were also used by 
Conrath (the first for a different reason). It is the third
constraint and how we treat it that provides the major 
relevance of this work. 

b. Coefficient determination

Determination of the coefficients in the linear combination
(1) of brightness temperatures, having the
three properties discussed above, is now considered.
We begin by letting $\mathbf{c}$ and $\mathbf{T}$ be the vectors of coefficients
and brightness temperatures in (1), respectively, and
define the $n$-dimensional vector

$$\mathbf{u} = [1, \ldots, 1]^{T}, \tag{5}$$

where the transpose superscript $T$ is used because all
vectors are assumed to be column vectors. Then (1)
can be written

$$t_L = \mathbf{c}^{T} \mathbf{T} \tag{6}$$

and (2) can be written

$$\mathbf{u}^{T} \mathbf{c} = 1. \tag{7}$$

If we let

$$\mathbf{D} = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_n^2) \tag{8}$$

be an $n$-dimensional diagonal matrix whose diagonal
elements are those indicated, then (3) can be written

$$\sigma^2 = \mathbf{c}^{T} \mathbf{D} \mathbf{c}. \tag{9}$$

Furthermore, if we let $\mathbf{W}$ be a matrix of weighting
functions with dimensions channel ($n$) by level ($j$),
then the averaging kernel $a(x)$ of (4) can be written
as the $j$-dimensional vector

$$\mathbf{a} = \mathbf{W}^{T} \mathbf{c}. \tag{10}$$


Fig. 7. The relationship between the 11 MSU beam positions
and the ten deep-layer mean temperatures.

Next, a shape vector $\mathbf{b}$ of $j$ elements (i.e., the desired
averaging kernel) is defined to constrain the shape
of the resulting averaging kernel. The coefficient
vector $\mathbf{c}$ is determined in such a way that the shape
of $\mathbf{a}$, given by (10), approximates the shape vector
as closely as possible. To do this, we minimize the
squared distance between the vectors $\mathbf{a}$ and $\mathbf{b}$, while
at the same time satisfying the constraint (7) and
minimizing (9).

We now are ready to determine the coefficient vector
$\mathbf{c}$ by optimizing our solution with respect to the three
properties just discussed. This is accomplished by first
establishing a cost, or penalty, function $F$, which incorporates
all three constraints. In its most general form
the cost function is

$$F(\mathbf{c}) = (\mathbf{W}^{T}\mathbf{c} - \mathbf{b})^{T} \mathbf{S} (\mathbf{W}^{T}\mathbf{c} - \mathbf{b}) + \gamma\, \mathbf{c}^{T}\mathbf{D}\mathbf{c} + 2\lambda (1 - \mathbf{u}^{T}\mathbf{c}), \tag{11}$$

where $\lambda$ and $\gamma$ are Lagrange multipliers and $\mathbf{S}$
is an arbitrary symmetric, positive definite (usually
diagonal) matrix of dimension $j \times j$. Note
that the three terms on the right-hand side of (11)
represent, respectively, the shape constraint (the averaging
kernel of (10) minus the shape vector $\mathbf{b}$), the error variance
constraint (9), and the coefficient normalization constraint (7).

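To find the vector $\mathbf{c}$ that minimizes $F$, we differentiate
$F$ with respect to $\mathbf{c}$ and equate the result to
zero. This yields

$$2\mathbf{W}\mathbf{S}(\mathbf{W}^{T}\mathbf{c} - \mathbf{b}) + 2\gamma \mathbf{D}\mathbf{c} - 2\lambda \mathbf{u} = 0, \tag{12}$$

which implies that

$$\mathbf{c} = (\mathbf{W}\mathbf{S}\mathbf{W}^{T} + \gamma \mathbf{D})^{-1} (\mathbf{W}\mathbf{S}\mathbf{b} + \lambda \mathbf{u}), \tag{13}$$

where the inverse matrix is well defined because $\mathbf{W}$ has
dimensions $n \times j$, with $n < j$ and rank $n$, and $\mathbf{D}$ and
$\mathbf{S}$ are positive definite.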

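One solves for the scalar $\lambda$ by multiplying (13) by
$\mathbf{u}^{T}$ and using (7) to obtain

$$\lambda = \frac{1 - \mathbf{u}^{T} (\mathbf{W}\mathbf{S}\mathbf{W}^{T} + \gamma \mathbf{D})^{-1} \mathbf{W}\mathbf{S}\mathbf{b}}{\mathbf{u}^{T} (\mathbf{W}\mathbf{S}\mathbf{W}^{T} + \gamma \mathbf{D})^{-1} \mathbf{u}}. \tag{14}$$

When (14) is substituted into (13), one obtains the desired
coefficient vector $\mathbf{c}$. At this point, all the quantities in
(13) and (14) are known except the scalar $\gamma$ and the
matrix $\mathbf{S}$. These two quantities are used to provide the
averaging kernel with the proper shape.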
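
As a minimal sketch of Eqs. (13) and (14), the NumPy function below solves for the coefficient vector. It is not the authors' code: the function name `solve_coefficients`, the synthetic Gaussian-shaped weighting functions, and the particular value of gamma are assumptions made for illustration, and real MSU weighting functions would have to be supplied in practice.

```python
import numpy as np


def solve_coefficients(W, b, noise_var, gamma, S=None):
    """Coefficient vector c from Eqs. (13)-(14).

    W         : (n, j) weighting functions, one row per channel/view angle
    b         : (j,)   shape vector (desired averaging kernel)
    noise_var : (n,)   error variances, i.e. the diagonal of D
    gamma     : scalar weight on the noise term
    S         : (j, j) shape-weighting matrix; identity if omitted
    """
    W = np.asarray(W, dtype=float)
    n, j = W.shape
    S = np.eye(j) if S is None else np.asarray(S, dtype=float)
    D = np.diag(np.asarray(noise_var, dtype=float))
    u = np.ones(n)

    M = W @ S @ W.T + gamma * D
    x_b = np.linalg.solve(M, W @ S @ b)      # M^-1 W S b
    x_u = np.linalg.solve(M, u)              # M^-1 u
    lam = (1.0 - u @ x_b) / (u @ x_u)        # Eq. (14)
    return x_b + lam * x_u                   # Eq. (13)


# Synthetic stand-ins for weighting functions on 100 levels (hypothetical).
levels = np.arange(100)


def unit_gaussian(center, width):
    g = np.exp(-0.5 * ((levels - center) / width) ** 2)
    return g / g.sum()                       # each function sums to one


W = np.stack([unit_gaussian(c0, 12.0) for c0 in (30, 40, 50, 60, 70)])
b = unit_gaussian(50, 10.0)

c = solve_coefficients(W, b, noise_var=np.full(5, 0.33**2), gamma=1e-4)
kernel = W.T @ c                             # Eq. (10)
print(c.sum(), kernel.sum())
```

Solving two linear systems with the same matrix avoids forming the inverse explicitly. Because every row of `W` sums to one and the coefficients sum to one, the resulting averaging kernel also sums to one.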

c. Shaping the averaging kernel 

Our ability to accurately fit an averaging kernel vector
of (10) to a given shape vector is limited by the
number of channels and viewing angles available to 
us. Ultimately, one would like to fit a boxcar function, 
since it represents a uniform average of the layer in 
question. Unfortunately, the limited number of chan- 
nels available to us prevents us from reproducing the 
edges of the nonzero portion of the boxcar function as 
well as the flat portion. Generally, the best one can do 
is to derive a shape similar to a narrowed weighting 
function. Other shapes are easier to fit. For example, 
in the next section we will demonstrate the use of 
Gaussian functions as well as weighting functions of 
different sensors. Note that it is immaterial for climate 
and global change studies that the shape of the derived 
averaging kernel is not uniform (i.e., flat) over the layer 
it defines. All that is required is that its shape be known 
with great accuracy and that the layer in question is 
well defined (i.e., the boundaries of the layer are clearly 
delineated with little or no energy leakage contribution 
from outside the boundaries). The algorithm-derived 
averaging kernel will display a good deal of “ringing”
if the shape function is too narrow (i.e., the boxcar is 
too narrow or, in the case of a Gaussian function, the 
value of the standard deviation is too small). Ringing 
is the undesirable phenomenon that, instead of having 
a flat zero response outside the nonzero portion of the 
shape function, one has a set of rapidly decaying pos- 
itive and negative oscillations. There are three mech- 
anisms that allow one to control ringing: First, the 
shape function being fitted cannot have its width too 
narrow; it must have its width at least comparable to 
the full width at half maximum (FWHM) of the 
weighting functions. The other controlling variables 
are the scalar $\gamma$ and the matrix $\mathbf{S}$.

The most important thing to realize about $\gamma$ and $\mathbf{S}$
is that they play competing roles (i.e., at all times a
trade-off situation exists between them). To see this,
consider separately the limiting cases where these
quantities are set to zero. First, when $\gamma = 0$ in (13)
and (14), the error variance constraint disappears; in
this case the fit to the desired shape is most realistically
(i.e., optimally) determined, subject of course to the
required coefficient normalization constraint. To see
the role of the matrix $\mathbf{S}$ in the limiting case $\gamma = 0$,
assume the usual practical situation where $\mathbf{S}$ is a diagonal
matrix. Assigning a relatively large value to the
$l$th diagonal element of $\mathbf{S}$ will cause the $l$th point of
the averaging kernel to fit the shape vector better at
the expense of the other points in the fit. This is equivalent
to saying that the $l$th element of the residual vector
$\mathbf{W}^{T}\mathbf{c} - \mathbf{b}$ in (11) will be relatively smaller. On the
other hand, if $\mathbf{S}$ is set equal to the identity matrix, all
the points will be fitted to the shape vector with equal
weight.


Fig. 8. Averaging kernels for the adjacent field of view deep-layer
mean temperatures. Plot B shows the nominal averaging kernel obtained
by fitting weighting functions associated with either beam positions
4 and 5 or beam positions 7 and 8 (view angles 3 and 4). Plots
A and C-E show the fit of the averaging kernels based on the remaining
pairings of beam positions to the nominal averaging kernel.

While the shape-vector fit is best when $\gamma = 0$, the
coefficients in the vector $\mathbf{c}$ of (13) no longer have a
constraint on their size, and so $\sigma^2$ in (9) can grow
without bound. Conversely, assume the other limiting
case in which $\mathbf{S}$ is the matrix having all elements zero
(i.e., $\mathbf{S} = 0$). In this case, the shape constraint disappears
and so the error $\sigma^2$ of (9) is minimized in (11).
If the errors of each element in $\mathbf{T}$ are identical, $\mathbf{D}$ in
(8) becomes the identity matrix multiplied by a constant
and the coefficient for each channel is $1/n$. In
other words, each channel has equal weight. But now
there is no control on the shape or size of the averaging
kernel. This also is an untenable situation. Clearly then,
neither $\mathbf{S}$ nor $\gamma$ can be zero; there must be a trade-off
between their magnitudes in order to achieve a satis- 
factory balance between an acceptable averaging kernel 
shape and an acceptable error level in the deep-layer 
mean temperature. 
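
The trade-off just described can be illustrated numerically. The sketch below, which is not the authors' code, uses hypothetical Gaussian-shaped stand-ins for the weighting functions, sweeps gamma over six decades with the shape-weighting matrix set to the identity, and then evaluates the limiting case of a zero shape-weighting matrix with equal channel noise. The helper names and all grid choices are assumptions.

```python
import numpy as np


def solve_coefficients(W, b, D, gamma, S):
    """Eqs. (13)-(14): minimize the cost function subject to sum(c) = 1."""
    u = np.ones(W.shape[0])
    M = W @ S @ W.T + gamma * D
    x_b = np.linalg.solve(M, W @ S @ b)
    x_u = np.linalg.solve(M, u)
    lam = (1.0 - u @ x_b) / (u @ x_u)
    return x_b + lam * x_u


levels = np.arange(100)


def unit_gaussian(center, width):
    g = np.exp(-0.5 * ((levels - center) / width) ** 2)
    return g / g.sum()


# Hypothetical weighting functions and a narrower target shape.
W = np.stack([unit_gaussian(c0, 12.0) for c0 in (30, 40, 50, 60, 70)])
b = unit_gaussian(50, 8.0)
D = 0.33**2 * np.eye(5)          # equal noise on every channel
S = np.eye(100)

gammas = [10.0 ** (-k) for k in range(6)]    # 1, 0.1, ..., 1e-5
noise, misfit = [], []
for g in gammas:
    c = solve_coefficients(W, b, D, g, S)
    noise.append(float(np.sqrt(c @ D @ c)))             # Eq. (9)
    misfit.append(float(np.sum((W.T @ c - b) ** 2)))    # squared shape error

# Limiting case S = 0: the shape term vanishes and only the noise is minimized.
c_no_shape = solve_coefficients(W, b, D, 1.0, np.zeros((100, 100)))

for g, s, m in zip(gammas, noise, misfit):
    print(f"gamma={g:g}  noise={s:.3f} K  misfit={m:.3e}")
print(c_no_shape)                # every coefficient equals 1/n
```

As gamma decreases the squared shape error can only decrease while the product noise can only increase, which is the compromise that has to be struck when choosing gamma.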

Fig. 9. Comparison of AMSU and MSU
(dotted curves) weighting functions.


The best strategy we found when using a boxcar
constraint is to define $\mathbf{S}$ as a diagonal matrix with values
of zero in the nonzero elements of the boxcar and values
of one elsewhere. This will tend to force the averaging
kernel to be zero outside the boxcar. For a
Gaussian function, we found that simply defining $\mathbf{S}$ to
be the identity matrix produced desirable results. The
reason $\mathbf{S}$ is not critical for a Gaussian function is probably
the smooth transition to zero from the
Gaussian's maximum. Figure 2 demonstrates the influence
of the $\gamma$ parameter for fitting a Gaussian shape
constraint (dashed curve) from MSU channels 2, 3,
and 4 weighting functions at all view angles. There are
six averaging kernels for six different values of $\gamma$. The
$\gamma$ parameter was given a value of unity to produce the
curve with the least Gaussian shape and then reduced
for each of the remaining five cases by an order
of magnitude. The curve best approximating the
Gaussian used a value of $1.0 \times 10^{-5}$ for $\gamma$. Equation
(3) was used to compute the noise of the deep-layer 
mean associated with each averaging kernel. In this 
study, the noise for all channels is always assumed to 
be 0.33 K. The deep-layer mean temperature noise as 
a function of $\gamma$ as well as the sample size required to
reduce the noise to 0.1 K are shown in Fig. 3 (since 
the noise is assumed to be random, the noise will be 
reduced by the square root of the sample size). Clearly, 
there is a trade-off between the goodness of fit and the 
product noise. The best fit cannot be used because the 
noise is too large. The averaging kernel associated with 
the lowest noise is too broad. A good compromise is 
the averaging kernel associated with a value of 0.0001 for
$\gamma$, because 1) the noise is at an acceptable level, 2) the
shape is similar to the shape constraint, and 3) there 
is no ringing. 
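
The sample size needed to reach a target noise level follows directly from the square-root rule quoted above. A small sketch, with a hypothetical function name and the 0.33 K single-channel noise assumed in this study as the example input:

```python
import math


def required_sample_size(product_noise_k, target_noise_k=0.1):
    """Smallest N such that product_noise / sqrt(N) <= target (random noise)."""
    return math.ceil((product_noise_k / target_noise_k) ** 2)


n_single = required_sample_size(0.33)
print(n_single, 0.33 / math.sqrt(n_single))
```

Because the required sample grows with the square of the product noise, doubling the noise of a product quadruples the number of observations that must be averaged.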

There is no exact recipe to derive the optimum av- 
eraging kernel and its associated coefficients. The fol- 
lowing is the methodology that we use. First, begin by 
selecting a desired width and mean height for the shape 
function. By trial and error we found that 0.001 is a 
good maximum value for $\gamma$. If ringing occurs, gradually
widen the function until the ringing subsides. Then
fine tune $\gamma$ by reducing it until a more desirable shape
of the averaging kernel is achieved. Remember, if $\gamma$ is
chosen too small, the elements of the coefficient vector
of (13) will be too large, which will amplify the total
error $\sigma^2$ in (9). The user needs to decide on an acceptable
error level. Note that if the deep-layer means
are averaged over large temporal and/or spatial domains,
then some allowance can be made in the individual
error level since averaging reduces the noise
by the square root of the sample size.

3. Application 

In this section, we will demonstrate the use of this 
algorithm to derive MSU averaging kernels of deep- 
layer mean temperatures that we believe would be op- 
timal for climate studies. AMSU averaging kernels will 
also be shown; however, the emphasis will be to dem- 
onstrate that AMSU measurements can be used to 
continue time series based on MSU measurements. 

a. MSU averaging kernels

Fig. 10. Actual and AMSU-derived MSU channel 2 weighting
functions for surface emissivities of (a) 1.0 and (b) 0.5.

Fig. 11. Actual and AMSU-derived MSU-2R averaging
kernels for surface emissivity of 1.0.


The Spencer and Christy (1992b) averaging kernel
shown in Fig. 1 was produced by combining MSU
channel 2 weighting functions associated with view angles
3 through 6. The Spencer product is widely known
as MSU-2R. The MSU-2R averaging kernel was computed
using coefficient values of 2.0 for angles 3 and
4, and -1.5 for angles 5 and 6. The first experiment
with our algorithm was to adjust the size of the boxcar
and the $\gamma$ parameter so as to yield an averaging kernel
most similar to the one produced by Spencer. As was
discussed in the previous section, there are not enough
channels to reproduce a boxcar. Of course, for global
change purposes the averaging kernel does not have to
be a boxcar; all that is necessary is for the averaging
kernel to be known. Our averaging kernel is shown along
with Spencer’s (dotted curve) in Fig. 4. These averaging
kernels are similar. The difference is that one was determined
subjectively and the other quantitatively using
the algorithm of section 2. The numerical values in
the columns labeled msu2, msu3, and msu4 are the
derived coefficients (i.e., the $c_i$). Each channel is associated
with six coefficients, one for each view angle,
beginning with view angle 1 (nadir). Hence, there is a
potential maximum of 18 effective channels. It is seen that the
only nonzero coefficients are at view angles 3 through
6 for channel 2. Also shown in the figure are the sum of
the square of the coefficients, the noise of the product,
the required sample size to reduce the noise of the
product to 0.1 K, and the integrated difference in kelvins.
The integrated difference is the scalar product
of the difference between the shape vector and the derived
averaging kernel vector with a standard midlatitude
temperature profile (vector). Note that the shape
constraints and weighting functions used by the algorithm
are defined at 100 levels equally spaced in log
pressure.
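
The integrated difference defined above is a single dot product. The sketch below is only illustrative: the shape vector, the derived kernel, and the two temperature profiles are made-up arrays on a 100-level grid, not the standard midlatitude profile or any kernel from this paper.

```python
import numpy as np


def integrated_difference(shape, kernel, temperature):
    """Scalar product of (shape - kernel) with a temperature profile, in K."""
    shape = np.asarray(shape, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    return float((shape - kernel) @ np.asarray(temperature, dtype=float))


levels = np.arange(100)


def unit_gaussian(center, width):
    g = np.exp(-0.5 * ((levels - center) / width) ** 2)
    return g / g.sum()


shape = unit_gaussian(50, 6.0)      # hypothetical shape constraint
kernel = unit_gaussian(52, 8.0)     # hypothetical derived kernel, shifted 2 levels

isothermal = np.full(100, 250.0)    # constant profile (K)
sloped = 200.0 + 1.0 * levels       # 1 K per level, purely illustrative

diff_iso = integrated_difference(shape, kernel, isothermal)
diff_sloped = integrated_difference(shape, kernel, sloped)
print(diff_iso, diff_sloped)
```

When both vectors sum to one, an isothermal profile gives an integrated difference of zero, so the diagnostic responds only to the vertical temperature structure; here the 1 K per level gradient turns the two-level shift between the curves into a difference of about -2 K.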

The advantage of using an algorithm to objectively 
determine the coefficients becomes quite clear if instead 
of using four measurements to produce an averaging 
kernel, all 18 effective channels are considered. For 
example, the averaging kernel given by the solid curve 
in Fig. 5 was obtained by using all view angles of MSU 
channels 2, 3, and 4, and hence all coefficients are nonzero.
This kernel is more desirable for monitoring temperature
in the lower troposphere than the other
averaging kernels shown in Fig. 4 since there is far less
signal from the surface.

As participants in the NOAA/NASA Pathfinder 
program (Ohring and Dodge 1992), we are planning
to construct time series of two types of deep-layer mean 
temperatures covering the entire MSU archive. We will 
provide additional atmospheric layers to the ones given 
by Spencer and colleagues. The first type is referred to 
as scan line products, since observations from all view 
angles are to be used simultaneously. For the second 
type, observations from adjacent FOVs are used. The 
remainder of this section is devoted to a discussion of 
these two product types. 

The scan line products are derived using
Gaussian shape constraints. There will be a total of six
different deep layers; their averaging kernels are shown
in Fig. 6. The values of the product noise associated
with the six averaging kernels, beginning with the highest
peaking one, are 0.89, 1.36, 0.76, 1.13, 0.62, and
0.81 K. Each Gaussian function had a standard deviation
of six pressure levels; the Gaussians began at pressure level
67 (100 mb) and were separated by six levels. The use
of Gaussian curves for the shape vector has the advantage
that one can better dictate the location and shape
of the derived averaging kernel. Because of the limited
number of channels on the MSU sounder, averaging
kernels confined solely to the stratosphere cannot be
provided. Note that these averaging kernels are relatively
narrow in comparison with the raw weighting
functions and could have been centered anywhere in
the troposphere without excessive ringing.


Fig. 12. Example of nadir Gaussian-derived AMSU averaging
kernels in the troposphere and lower and upper stratosphere.

The scan line product’s poor horizontal resolution,
which is on the order of 1100 × 150 km² (the mean
area of the scan line projected on the earth on either side
of nadir), results in relatively poor sampling. To monitor
small temperature fluctuations, these products will
need to be averaged over relatively large spatial and
temporal domains to reduce the product noise. One
suggestion is to average over 10° latitude bands and
for time intervals on the order of a month. The sample
size will be about 10 000 (assuming 2 products per scan
line, 220 scan lines per orbit, and 14 orbits per day). The
single largest product noise of 1.36 K will be reduced
to a precision of 0.0136 K.
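
The sample-size estimate above can be reproduced with simple arithmetic. The per-scan, per-orbit, and per-day counts are those given in the text; the 30-day month and the equal division of observations among eighteen 10° latitude bands are assumptions introduced here to arrive at the quoted round figure.

```python
import math

PRODUCTS_PER_SCAN_LINE = 2
SCAN_LINES_PER_ORBIT = 220
ORBITS_PER_DAY = 14
DAYS_PER_MONTH = 30          # assumption
LATITUDE_BANDS = 180 // 10   # assumption: observations split evenly among bands

per_day = PRODUCTS_PER_SCAN_LINE * SCAN_LINES_PER_ORBIT * ORBITS_PER_DAY
per_band_month = per_day * DAYS_PER_MONTH / LATITUDE_BANDS

# Noise of the averaged product for the rounded sample size used in the text.
averaged_noise = 1.36 / math.sqrt(10_000)

print(per_day)                 # 6160 products per day globally
print(round(per_band_month))   # a little over 10 000 per band per month
print(averaged_noise)
```

With about 10 000 independent samples the noise falls by a factor of 100, which takes the largest single-product noise of 1.36 K down to the 0.0136 K quoted above.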

For regional climate monitoring the horizontal res- 
olution of the product needs to be much smaller. To 
do this we want to derive products that are ideally based 
on a single FOV, that is, 11 products for the
11 FOVs along the MSU scan line. There are different
ways to do this. To apply the same set of coefficients 
to all view angles, the off-nadir measurements need to 
be adjusted to look as if they were observed at nadir 
(i.e., limb correct the measurements). For example, 
one can collect a large ensemble of measurements for 
all FOVs and compute regression coefficients using the 
measurements observed at a given FOV as the predictors
and measurements observed at nadir as the predictands
(Wark 1993). The MSU-2 and MSU-4 time
series given in Spencer and Christy (1992a, 1993), respectively,
were limb corrected.

Our approach is not to use a statistical method to
limb correct, since we believe it is undesirable to adjust
the measurements based on historical data. An attempt
was made to physically limb correct the MSU by using
the algorithm to compute coefficients for combining
weighting functions at a particular off-nadir view angle
to fit the nadir-viewing weighting functions. Unfortunately,
this technique did not work well at the larger
view angles. We also tried to compute a different set
of coefficients for each view angle in order to fit a common
averaging kernel. However, a combination based
on only three channels was insufficient to maintain the
same averaging kernel along the scan line. The solution
was to use information from a pair of adjacent FOVs,
which provides a total of six weighting functions to fit
the desired averaging kernel. To better visualize this
approach, the MSU scan line geometry and the adjacent
FOVs used to yield the ten deep-layer mean temperature
products across each scan are given in Fig. 7.
So instead of two products per scan line, this technique
yields ten products. The nominal averaging kernel
(dotted curve) is shown in plot B of Fig. 8. This
averaging kernel was derived from a boxcar constraint
and used weighting functions from view angles 2 and
3, which correspond to either fields of view 4 and 5 or
7 and 8. The nominal averaging kernel was then used
as a shape constraint for the other angular combinations
given in plots A and C-E of Fig. 8. Notice that the fit
is good for all combinations with the exception of the
largest off-nadir angles. However, even though the
integrated difference for that combination is 2.0 degrees,
we found that the frequency distributions of the
ten products are very similar and differ only by an
offset. For each of the ten products, its mean condition
will be subtracted when the data are analyzed. Hence,
the bias for a given combination is removed.

Fig. 13. Gaussian-derived AMSU averaging kernels
using channels 4-14 at all view angles.

b. Continuing the MSU time series with AMSU 

Nadir-viewing weighting functions for AMSU chan- 
nels 4 through 15, along with those from the MSU 
(dotted curves), are given in Fig. 9. Note that for
AMSU there is actually no nadir position; the nearest
off-nadir angle is 3.33 degrees. Of utmost concern is
the ability to continue the record of tropospheric tem- 
perature trends established with MSU channel 2 and 
other linear combinations of the MSU channels. The 
AMSU channel that is most similar to MSU channel 
2 is AMSU channel 5 (53.596 GHz). However, this 
channel is slightly more sensitive to the lower atmo- 
sphere and has a larger surface contribution, thereby 
producing a different signal. A change in the deep-layer 
mean temperature weighting function could conceiv- 
ably produce a spurious signal in the time series based 
solely on MSU channel 2 and AMSU brightness tem- 
peratures. 

The question to be answered is whether this algorithm
can reproduce the MSU channel 2 weighting function from
the AMSU channel weighting functions. The answer
to this question is yes. In Fig. 10 there are four curves.
The curves to the left are the actual MSU channel 2 
(solid curve) and the AMSU reconstructed MSU 
channel 2 (dotted curve) for a surface emissivity of 
1.0. The other set of curves is for an emissivity of 0.5.
The shape is different because the emissivity enters into 
the computation of the weighting functions. Only the 
“nadir” AMSU channels 4 through 7 weighting func- 
tions were used in deriving the coefficients. The selected 
emissivities are the extreme values for the surface
emissivity in the 50-GHz band. The actual and recon- 
structed MSU channel 2 weighting functions are vir- 
tually identical. It is very important to note that the 
coefficients, based on an emissivity of 1.0, were used 
for reconstructing the weighting functions for an emis- 
sivity of 0.5. In other words, the reconstruction of MSU 
channel 2 is insensitive to surface emissivity, which is 
very important since the estimation of surface emis- 
sivity would add uncertainty to the final product. By 
using the appropriate γ, the integrated error between
the real and reconstructed MSU channel 2 weighting 
function can be forced to be virtually zero. If we did 
nothing and simply used AMSU channel 5 to continue 
MSU channel 2, there would be a sizable airmass-dependent
bias. For a summer midlatitude atmosphere,
the bias would be about 5.6 K. The linear combination 
of AMSU to yield an equivalent MSU channel 2 mea- 
surement is 

$$T_{\mathrm{msu2}} = -0.0488\,T_{\mathrm{amsu4}} + 0.932\,T_{\mathrm{amsu5}} + 0.208\,T_{\mathrm{amsu6}} - 0.466\,T_{\mathrm{amsu7}}. \tag{15}$$
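
The kind of fit described above, in which several channel weighting functions are combined to approximate a target weighting function, can be illustrated with a minimal sketch. It is not the author's formulation: it assumes a generic least-squares fit with a sum-to-one constraint and a ridge-type noise penalty controlled by a trade-off parameter (here called `gamma`), unit noise in every channel, and synthetic Gaussian curves in place of real MSU or AMSU weighting functions. The function names and grid are hypothetical.

```python
import numpy as np

def fit_coefficients(weights, target, gamma=0.0):
    """Return coefficients a minimising
    ||weights @ a - target||^2 + gamma * ||a||^2  subject to sum(a) == 1.

    weights: (n_levels, n_channels) weighting functions on a common grid.
    target:  (n_levels,) weighting function or averaging kernel to approximate.
    gamma:   penalty on noise amplification (unit channel noise assumed).
    """
    n_channels = weights.shape[1]
    normal = weights.T @ weights + gamma * np.eye(n_channels)
    # KKT system for the equality-constrained least-squares problem.
    kkt = np.zeros((n_channels + 1, n_channels + 1))
    kkt[:n_channels, :n_channels] = normal
    kkt[:n_channels, n_channels] = 1.0
    kkt[n_channels, :n_channels] = 1.0
    rhs = np.append(weights.T @ target, 1.0)
    return np.linalg.solve(kkt, rhs)[:n_channels]

def gaussian(z, peak, width):
    """Unit-area Gaussian on the grid z (a stand-in for a weighting function)."""
    g = np.exp(-0.5 * ((z - peak) / width) ** 2)
    return g / (g.sum() * (z[1] - z[0]))

# Hypothetical vertical grid (in scale heights) and four synthetic "channels".
z = np.linspace(0.0, 4.0, 201)
peaks = [0.3, 0.8, 1.4, 2.0]
weights = np.column_stack([gaussian(z, p, 0.6) for p in peaks])
target = gaussian(z, 1.0, 0.6)  # synthetic target peaking between two channels

coef = fit_coefficients(weights, target)
coef_smooth = fit_coefficients(weights, target, gamma=1.0)

fit_error = np.linalg.norm(weights @ coef - target)
nearest_error = np.linalg.norm(weights[:, 1] - target)
print("coefficients:", np.round(coef, 3), "sum:", coef.sum())
print("fit error:", fit_error, "nearest-channel error:", nearest_error)
print("noise factor:", np.sqrt(coef @ coef), "->", np.sqrt(coef_smooth @ coef_smooth))
```

In this toy setting the combined channels match the target at least as well as the single nearest channel, and raising `gamma` lowers the noise amplification at the cost of a poorer fit, which is the trade-off between noise and kernel shape noted in the abstract.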

Spencer’s MSU-2R product can also be reproduced 
from the AMSU. AMSU channels 4 through 8 using 
10 angles ranging from 18.66 to 49.55 degrees were 
combined to fit the MSU-2R averaging kernel. The 
AMSU equivalent of MSU-2R is given in Fig. 11.
The coefficients are obtainable from the author. 

Even though the accuracy of fitting AMSU to MSU 
appears to be high, the underlying assumption is that 
the weighting functions are known exactly. In practice, 
we know this is not true. Therefore, in conjunction 
with this algorithm, overlap of MSU and AMSU will 
be needed to adjust for the component that is left over 
after the “known” physics have been accounted for. 

The AMSU by itself will be a very important sensor 
for monitoring temperature trends throughout the at- 
mosphere. Its numerous channels will enable one to 
monitor temperature in three important regions of the 
atmosphere: the upper and lower stratosphere and the 
troposphere. Figure 12 shows examples of AMSU av- 
eraging kernels in these three regions. All were derived 
from initial Gaussian curves using only nadir mea- 
surements. Narrower averaging kernels can be achieved 
by utilizing off-nadir measurements. The technique 
used to generate the six averaging kernels, shown in 
Fig. 6, was applied to AMSU channels 5 through 14
weighting functions at all view angles. The result,
shown in Fig. 13, clearly demonstrates that the ability 
to derive these averaging kernels is no longer restricted 
to the troposphere. It is also important to mention that 
the lowest six averaging kernels in Fig. 13 are virtually 
identical to the six averaging kernels shown in Fig. 6. 
Therefore, in addition to Spencer’s time series of MSU, 
we will be able to extend our own time series with 
AMSU. 

4. Summary 

An algorithm for deriving deep-layer mean temper- 
atures from microwave sensors has been developed. 
The algorithm, in conjunction with the microwave 
channels considered in this study, is completely inde- 
pendent of a priori information. Independence from 
ancillary data is critical for high-precision monitoring 
of climate trends, so that any observed trends in the 
deep-layer mean temperatures are attributed only to
trends in the sensor’s measurements. The algorithm 
also has been shown to be capable of combining mea- 
surements from next-generation microwave sensors to 
reconstruct measurements from current sensors. This 
enables one to generate continuous time series of sat- 
ellite-derived temperature trends accurately, regardless 
of changes in satellite instrumentation. 

The next step is to produce the actual MSU time 
series from the two types of deep-layer mean temper- 
atures we plan to derive as part of the TOVS Pathfinder 
project. The first type will yield six different atmo- 
spheric deep-layer mean temperatures; their averaging 
kernels were shown in Fig. 6. The second type uses 
adjacent angular combinations to yield a single at- 
mospheric averaging kernel. The important feature of 
the second type is that for each adjacent combination, 
the averaging kernel along the scan line is preserved 
so that limb correcting the measurements can be 
avoided. 